TPUG - Toronto PET Users Group

Text File | 2009-01-18 | 20KB | 316 lines

ASP : A STATISTICAL PACKAGE
===========================

The programs on this disk comprise a statistical analysis package. Complete documentation, sample problems, etc. can be found in 'Computing in Statistical Science Through APL' by Francis Anscombe, published by Springer-Verlag New York Inc., 1981.

Two conventions should be noted:

1) Abscissas are always mentioned before ordinates (as with the arguments of 'regrinit' and 'scatterplot').
2) When numerical vectors are stacked in a matrix, they are the columns (as with the result of 'jacobi' and the argument of 'downplot').

* * * * *

Let us look at the workspaces and their contents.

UTILITY:

enter: Enter data for storage in an array called matrix.
store: Save data to disk.
print: Print output to ieee4 printer.
test: Test carriage control.
read: Get file from disk.
listfile: List a disk file.

MULT/REGR: Multiple regression by stages and examination of residuals.

x regrinit y: Used once at the outset to set up global variables for 'regr'. The first argument X is a matrix whose columns list values of the independent variables. Usually one column is all 1's and its index number is the argument in the first call of 'regr'. Each column of X should have been multiplied by a power of 10 so that the unit place is the last significant one. The second argument Y is either a vector listing values of one dependent variable or a matrix whose columns list values of several dependent variables. Just as with X, the unit place should be the last significant one.
regr l: Performs regression on one or more designated independent variables. The argument L is a scalar or vector listing the index number(s) of the independent variable(s) to be brought into the next regression.
show v: May be used at any stage to obtain summary information about a vector. The argument V is a vector, such as RY (if a vector) or a column of RY (if a matrix) or a column of RX or DIAGQ.
    The frequency distribution in the output is over six intervals of equal length; the 1st and 6th are centered on the least and greatest values occurring in V. 'show' refers to no global variables, and may be used outside this regression context.
stres: Gives standardized residuals of the dependent variable(s) for use in scatterplots. No arguments. Should be used only after 'regr' has been executed.
variance: Yields the conventional estimated variance matrix of the regression coefficients. No arguments. Should be used only after 'regr' has been executed.
n sample cp: Sample of size N from a distribution over non-negative integers having cumulative probabilities CP.

APLTESTS: Tests on residuals after least squares linear regression.

test i: Tests for distribution shape, heteroscedasticity, nonadditivity, and serial correlation are carried out on residuals from a fitted regression relation. Several global variables from 'regr' are needed. Special cases of the tests appear in 'rowcol' and 'summarize'. The argument I is either 1 or 2, controlling which subsidiary function is used for computing moments of the test statistics. (*This routine appears to be missing from the disk.)
tests1: Calculates exact 2nd and 3rd moments.
tests2: Calculates exact 2nd and approximate 3rd moments.
ccd s: Carries out a complete Cholesky decomposition.

REG/PLOT

x cor y: Calculates the correlation coefficient between vectors X and Y.
stdize x: The vector X is rescaled to have zero mean and unit variance.
rlogistic n: N random logistic deviates, mean 0, variance 1.
rnormal n: N random normal deviates by the Box-Muller method.
p quantiles v: Quantiles of the vector V for given proportions P.
x fit y: Causes information to be displayed about the means of the variables and the regression coefficient, together with a conventional estimated standard error for the latter, calculated as though the errors were independent.
nif x: Gives the Hastings approximation to the normal integral function.
    X may be any numerical array.
u scatterplot v: Used for displaying corresponding members of two vectors. U and V are vectors of equal length. Corresponding members U[j] and V[j] are plotted as abscissa and ordinate of a point.
u tscp v: A triple scatterplot in which a third dimension is suggested by varying the symbols used in plotting the points.
downplot v: Used for plotting members of one or more vectors against their index numbers. The argument V is either a vector or a matrix. If V is a vector, V[j] is plotted against j. Otherwise, V[j;] is plotted against j.
z tdp v: A triple downplot in which the third dimension is suggested by varying the symbols used in plotting the points. The argument Z is a character scalar, vector or matrix, indicating the symbols to be used in plotting the points. If Z is a scalar, the same symbol is used every time.

TABLES

summarize y: Provides summary statistics of a data set. The argument Y may be any numerical array having at least 4 members. Measures of location, scale, and shape of distribution are displayed.
rowcol y: Performs a standard analysis of variance on a row-column cross-classification, along with an additive analysis of a two-way table, with tests on residuals.
rowcolpermute: Permutes the global variables RE, CE and RY. Use after 'rowcol' and before 'rowcoldisplay'. Puts CE in ascending and RE in descending order.
rowcoldisplay i: A special function substituting for 'tscp' that may be used to display the output of 'rowcol'. Abscissas and ordinates are the column effects and the row effects. The argument I is the change in column effect represented by a unit horizontal displacement.
analyze y: Begins the analysis of variance of a perfect rectangular array.
effect v: Estimates a designated main effect or a designated interaction. The argument V is a list of one or more coordinate numbers, just one for a main effect, two or more for an interaction.
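
As a rough illustration of what an additive analysis of a two-way table involves, the Python sketch below splits a table into a grand mean, row effects, column effects and residuals using ordinary means. It is not a transcription of 'rowcol' or 'effect'. The function name additive_fit and its return layout are assumptions made for this example, and the APL functions may scale or arrange their output differently.

```python
def additive_fit(table):
    """Fit table[i][j] ~ grand + row_eff[i] + col_eff[j] by means.

    Hypothetical helper for illustration only; 'table' is a list of
    equal-length rows of numbers.
    """
    nrow, ncol = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (nrow * ncol)
    # Row effect: row mean minus grand mean.
    row_eff = [sum(row) / ncol - grand for row in table]
    # Column effect: column mean minus grand mean.
    col_eff = [sum(table[i][j] for i in range(nrow)) / nrow - grand
               for j in range(ncol)]
    # Residuals: what the additive structure leaves unexplained.
    resid = [[table[i][j] - grand - row_eff[i] - col_eff[j]
              for j in range(ncol)]
             for i in range(nrow)]
    return grand, row_eff, col_eff, resid


if __name__ == "__main__":
    grand, row_eff, col_eff, resid = additive_fit([[1, 2, 3], [4, 5, 6]])
    print(grand, row_eff, col_eff)
```

For a table that is exactly additive, such as the one in the usage lines, the residuals are all zero. Large residuals are what the tests on residuals mentioned under 'rowcol' would be examining.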
mp x: This routine, median polish, fits an additive structure to a two-way table by repeatedly subtracting medians of rows and medians of columns.
n bartlett s: Bartlett's test for homogeneity of variances. The second argument is a vector of unbiased variance estimates. The first argument is the degrees of freedom, either the common value for all the variance estimates or a vector of values, one for each variance estimate. Uses Box's approximation by the F distribution and Bartlett's original chi-square approximation.

CONTINGENCY: Analysis of contingency tables.

contingency x: Applies to two-dimensional contingency tables and performs a chi-square test of association, with display of standardized residuals. The argument X must be a matrix of non-negative numbers, having no zero marginal totals.
fourfold x: Also applies to two-dimensional tables and is used when the categories of each classification are ordered. Empirical log crossproduct (fourfold) ratios are displayed. A Plackett distribution is fitted and goodness of fit is tested by chi-squared, with display of standardized residuals. The argument X is the same as in 'contingency'.
multipoly x: Applies to contingency tables in any number of dimensions, finding empirical log crossproduct ratios analogous to those of 'fourfold' for two dimensions. The argument X must be an array of non-negative integers in 2 or more dimensions.
v pool x: Also applies to multi-dimensional arrays, pooling categories in any table. The second argument X is an array in 2 or more dimensions, typically a contingency table or a table of expected frequencies. The first argument V is a vector with at least 3 elements. V[1] specifies the coordinate and 1(down arrow)V specifies the index values over which there is to be pooling. Sections of X corresponding to index values 2(down arrow)V are added to the section with index value V[2], and then the former sections are deleted.
    Thus if X is a matrix with 5 rows and V is 1 4 5 1, pooling will be over the 1st coordinate (rows); the contents of rows 4, 5 and 1 will be added and placed in row 4, and rows 5 and 1 will be dropped, so the result Y has three rows.
lfact x: Calculates the factorials of X.
n csif x: The tabulated chi-squared integral function.
inif p: Odeh-Evans approximation to the inverse of the normal integral function.
ctg2 x: A function similar to 'contingency' except that the probability-of-the-sample statistic and also the likelihood-ratio statistic are calculated instead of Pearson's chi-squared. Calls 'lfact', 'csif' and 'inif'.

FUNCTIONS

x max y: Maximization of a function of one variable. The arguments are vectors of length 3, the first having no two members equal. The explicit result is the coordinates of the vertex of a parabola with vertical axis that goes through the three points whose abscissas are the first argument and whose ordinates are the second argument.
h integrate a: Integration of one-dimensional definite integrals. The first argument H, a scalar, is the step size. The second argument A is a vector of 2 or more limits of integration, in ascending order, with differences all divisible by H. The explicit result Z is a vector of length 1 less than the length of A, listing the definite integrals from A[1] to each of the other members of A. The function to be integrated is asked for, and must be expressed in terms of an argument X (a local variable). For example, to integrate 'sin x' from 0 to each of 1 2 3 and compare the result with '1 - cos x':
        0.1 integrate -1+1 2 3 4
    then enter
        1ox
    and
        1-2o1 2 3
inif p: See 'CONTINGENCY'.
nif x: See 'REG/PLOT'.

TIME:

cw r: Calculates weights for a cosine-weighted moving average of length R.
m filter x: For filtering time series. The first argument M is a pair of integers defining the filter. The second argument X is a vector of data to be filtered.
    Filtering consists of subtracting a cosine-weighted moving average of extent M[2] from a similar moving average of extent M[1]. The resulting weights are displayed, together with their sum of squares. The elements of M must be either both odd or both even, or else one of them must be 0.
w mav x: Moving average or filtering of a series X with weights W. A moving average with arbitrary weights is taken of either one vector or several vectors simultaneously. The first argument W is the vector of weights. The second argument X is either a vector of data or a matrix whose columns are the vectors of data to be averaged. The result U is either a vector or a matrix.
k taper u: Fourier analysis of time series. The first argument K is a positive integer and the second argument U is a vector of length greater than 2xK.
d fft v: Fast Fourier transform. The first argument D is a scalar, either 0, 1 or 2. The second argument V is a three-dimensional array. If D is 0, the function yields the complex Fourier transform of a single complex time series. If D is 1, the function yields the real Fourier transform of a single real series; the transform is scaled to give a direct analysis of variance. If D is 2, the function yields simultaneously the real transforms of two real series, each scaled as when D is 1.
polar s: Used to transform the output of 'fft' to polar form. The argument S is a matrix with 2 columns, such as the result of 'fft' when D is 1.
w ma x: Moving average or filtering of a series X with weights W.

*** The next five routines carry out a harmonic regression of a time series on one or more other time series.

prehar: Generates phase-difference plots between pairs of series. The data is passed to 'prehar' in the global variable FT, a three-dimensional array.
harinit u: Initializes for the remaining functions.
b har1 v: Performs harmonic regression of a dependent series on one predictor series.
b har1r v: Generates the residual series after execution of 'har1'.
b har2 v: Performs harmonic regression of a dependent series on two predictor series.

tsnt u: Carries out a normality test on a time series. The argument is a vector of innovations.
mardia x: Mardia's multivariate kurtosis test.
end: Invoked by 'tsnt' and 'mardia'.

GAIN

f gain w: Gain of a linear filter. If the weights are symmetric, the lag is taken to be constant, equal to 0.5x-1+pw, and the gain may be negative. If the weights are unsymmetric, the lag is not computed and g is merely the magnitude of the gain.
f gainlag w: Used to calculate lag and gain when W is not symmetric. F is supposed to be in increasing order, and the gain should not vanish at any member of F; 'gain' may be run to verify this. Consecutive members of F should be close enough together for phase changes to be small.
n autocov v: Calculates the first N serial correlations of V.
x fit y: See 'REG/PLOT'.

PLACKETT

p1 pd2 p2: The explicit result is a two-dimensional Plackett distribution having discrete marginal distributions with probabilities listed in the arguments. Each argument must consist of at least two positive numbers summing to 1. One global variable must be defined before execution: the natural logarithm of the fourfold crossproduct ratio of probabilities.
pd3: Requires 7 global variables to be defined before execution: P1, P2 and P3 list probabilities in the discrete univariate marginal distributions; L12, L13 and L23 give the log fourfold ratios in the bivariate marginal distributions; TOL is a positive tolerance such as 0.001.
x biv y: Invoked by 'pd2' and 'pd3'.
a pp b: Finds the product of two polynomials.
collect a: Used with 'pp'; collects like terms in a polynomial.
n isotropy l: Test of sphericity of a multivariate normal distribution. The first argument N is the number of degrees of freedom in the variance matrix. The second argument L is a vector of positive roots of the variance matrix.
c jacobi x: Characteristic roots and vectors of a symmetric matrix.
    The second argument X is the given matrix, to be transformed towards diagonal form by Jacobi's method. The first argument C, a positive scalar, is the tolerance for off-diagonal elements in the transformed matrix.

REG7: Regression when errors have a type 7 (or 2) distribution.

t7init: Initializes.
t7lf k: Evaluates the marginal likelihood function L at a trial set of values of the regression parameters, after integration of the likelihood with respect to the scale parameter and the shape parameter. The argument K specifies the order of derivatives needed: 0 for L; 1 for L and DL; 2 for L, DL and D2L. The argument should be 2 at first execution.
t7s q: Estimates the change in the regression parameters needed to reach the maximum of the marginal likelihood. The argument Q should be not less than 3/n.
t7a: Invoked by 't7lf'.
t7b: Invoked by 't7lf'.

HOUSEHOLDER: Regression by Householder transformations, uncorrelated residuals.

x hht y: Arguments as per 'regrinit', except that the last significant digit in all columns of X that have observational error should be in the same place, but not necessarily in the unit place.

HUBER: Robust regression.

r huber z: The function 'huber' performs one cycle of iteration towards minimizing the sum of a function rho of the residuals in a regression problem, where rho is defined in terms of a positive constant K. The first argument R must be a scalar and either -1 or between 0 and 1. The second argument Z is the array of residuals corresponding to a trial setting of the regression parameter 'beta', which must be pre-specified.
huber1: Invoked when Z is a vector. There must be a global matrix X whose columns are the independent variables in a regression.
huber2: Invoked when Z is a matrix. Z must have at least 3 rows and 3 columns. The usual additive structure is fitted.

ASPDATA: Contains various test files used in the book.

print: Prints a variable, i.e. print varname.
enter: As in UTILITY.

**** List of variables in 'aspdata'

loblollydata: Average heights of loblolly pines in feet.
enrol: Enrolment at Yale University 1796-1975.
imports: Imports of merchandise 1790-1975.
year: A list of year numbers 1796 to 1975.
butter: Wholesale price of butter at New York (cents/pound) 1830-1975.
housing: Total number of new housing units started (thousands) 1889-1975.
se70: School expenditures, 51 points of data, one for each state.
pi68: Personal incomes, 51 points of data, one for each state.
y69: Young persons, same as 'se70'.
urban70: Proportion urban population, same as 'se70'.
m1: White-Eisenberg table, stomach cancer site and blood group.
m2: Graunt table, sex of christenings in London and country.
m3: Francis table, summary of 1954 poliomyelitis vaccine trial.
m4: Kiser-Whelpton table, education of wife and fertility planning.
m5: Gilby table, clothing and intelligence rating of schoolboys.
m6: Stuart table, men's distance vision in right and left eyes.
m7: Glass table, social mobility, father's status and son's status.
y349: A matrix of values on page 349.
x350: A matrix of values on page 350.
num: scalar=80; rec: scalar=81; vars: vector=80 1; vector: scalar=60

**************************** GENERAL COMMENTS ***********************

The functions described above work together using many global variables. However, it appears that they can be organized into three groupings, as the author stored them in three separate workspaces on his system:

1) analyze cor downplot effect fit quantiles regr regrinit rowcol rowcoldisplay rowcolpermute scatterplot show stdize stres summarize variance

2) autocov contingency cw fft filter fourfold harinit har1 har1r har2 huber huber1 huber2 ma mav multipoly polar pool prehar taper tdp tscp

3) bartlett biv ccd collect csif ctg2 end gain gainlag hht inif integrate isotropy jacobi lfact mardia max mp nif pd2 pd3 pp rlogistic rnormal sample tests tests1 tests2 tsnt t7a t7b t7init t7lf t7s
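
For readers who do not have the workspaces loaded, the pooling rule described under 'pool' in the CONTINGENCY workspace can be tried out with the Python sketch below. It is an approximation for illustration only: it handles matrices (lists of rows) rather than arrays of any dimension, the function body is not taken from the APL source, and the sample data are invented. The argument order and the 1-based indices follow the description above.

```python
def pool(v, x):
    """Pool rows or columns of the matrix x (a list of equal-length rows).

    v uses 1-based indices as in the APL description: v[0] is the
    coordinate (1 = rows, 2 = columns), v[1] is the index that receives
    the pooled total, and v[2:] are the indices added into it and then
    dropped. Illustrative sketch, limited to two dimensions.
    """
    if len(v) < 3:
        raise ValueError("v needs at least 3 elements")
    coord = v[0]
    target = v[1] - 1
    others = [i - 1 for i in v[2:]]
    # Work on rows; for columns, transpose first and transpose back later.
    if coord == 1:
        rows = [list(r) for r in x]
    else:
        rows = [list(c) for c in zip(*x)]
    width = len(rows[0])
    rows[target] = [sum(rows[i][j] for i in [target] + others)
                    for j in range(width)]
    kept = [r for i, r in enumerate(rows) if i not in others]
    if coord == 1:
        return kept
    return [list(c) for c in zip(*kept)]


if __name__ == "__main__":
    x = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]  # invented 5-row table
    for row in pool([1, 4, 5, 1], x):
        print(row)
```

With V given as 1 4 5 1 and a 5-row table, rows 4, 5 and 1 are summed into row 4 and rows 5 and 1 are dropped, leaving three rows, as in the example in the 'pool' entry. The grand total of the table is unchanged by pooling.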

* * * * *

print/basic: This is a Waterloo BASIC program to print an APL data file.

* * * * *

DISCLAIMER:

This Waterloo text file has been added to this disk for your use and as encouragement to try the statistical functions. I am not a statistician and it is quite possible I have misinterpreted the meaning or significance of some function or variable.

Bill Dutfield
November 11/83

* * * * *

A THANK YOU!

I did not enter the APL functions and I would like to thank that person (unknown to me) who did. The effort he put into entering and checking these functions was time consuming, but I believe worthwhile, not only for himself but for us. I would like to thank Roger Green for bringing them to the attention of the club and for making them available to us, the SuperPET group of TPUG (Toronto PET Users Group).

***** End of File *****